1 



The Spatial Real and Virtual Sound Stimuli Optimization for 
the Auditory BCI 

Nozomu Nishikawa 1 , Yoshihiro Matsumoto 1 , Shoji Makino 1 and Tomasz M. Rutkowski 1 ' 2 '* 

1 Multimedia Laboratory, TARA Center, University of Tsukuba, Tsukuba, Japan 

2 RIKEN Brain Science Institute, Wako-shi, Japan 
* E-mail: tomek@tara.tsukuba.ac.jp 

Abstract 

The paper presents results from a project aiming to create horizontally distributed surround sound sources 
and virtual sound images as auditory BCI (aBCI) stimuli. The purpose is to create evoked brain wave 
response patterns depending on attended or ignored sound directions. We propose to use a modified 
version of the vector based amplitude panning (VBAP) approach to achieve the goal. The so created 
spatial sound stimulus system for the novel oddball aBCI paradigm allows us to create a multi-command 
experimental environment with very encouraging results reported in this paper. We also present results 
showing that a modulation of the sound image depth changes also the subject responses. Finally, we 
also compare the proposed virtual sound approach with the traditional one based on real sound sources 
generated from the real loudspeaker directions. The so obtained results confirm the hypothesis of the 
possibility to modulate independently the brain responses to spatial types and depths of sound sources 
which allows for the development of the novel multi-command aBCI. 

1 Introduction 

In recent years, many researchers who study the brain waves plug to the technological development of 
using electroencephalography (EEC) signals for engineering applications. The research field has made 
a remarkable development. It is a fact that thanks to such application the disabled people could move 
successfully a cursor or a wheelchair using their brain wave patterns only. Thanks to the BCI technology 
it is expected that disabled people will receive the various services only by thinking or interacting with 
their environments \U. 

The spatial auditory BCI (saBCI) is the interface that makes use of the brain waves captured in 
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response to spatial sound stimuli. In our research, we aim at the practical realization and enhancement 
of the saBCI |]. Figure [l] depicts our saBCI system prototype that performs the following: 

• It creates the spatial sound stimuli in the octagonal surround sound loudspeaker system using the 
modified VBAP approach as well as the conventional real sound sources; 

• It captures and analyzes the EEG event related responses (ERP) to spatial sound stimuli; 

• Finally it classifies the responses in order to generate the BCI application command (exemplary 
wheelchair in the diagram). This paper does not include yet classification results. 

In the presented project we refer to the fact that a human brain given a sound stimulus generates 
the electric ERP pattern [3j. Attentional modulation adds a characteristic deflection (so called "aha 
response") which could be utilized in the saBCI paradigm development. This ERP feature depending on 
whether the sound is attended (so called target) or not (non-target) allows for further classification of 
subject's choices [4|[5] . "The aha-response" to the attended target sound in comparison with the other 
non-target has the characteristic positive deflection after 300ms from the onset and thus it is called "the 
P300 response." An example of an averaged P300 response (solid blue line - target), from our experiments, 
is depicted in Figure [2] together with the ignored sound direction (dashed red line - non-target). The 
properly classified, by a machine learning application, P300 response lets the BCI application user to 
select a command. In this paper we only identify the responses separability and the future research will 
result with final classification. 

There are multiple optimization steps in a course of the saBCI system development. The first step 
is an optimal choice of the number of loudspeakers in order to create a multi-command system. In 
the state-of-the-are saBCI applications [5][6] the number of commands depends on the number of real 
loudspeakers, since they generate the sound sources. We aim to increase the number by employing a 
concept of a virtual sound sources or so called virtual sound images in our saBCI prototype. This will 
allow to decrease the number of the necessary loudspeakers or to increase a number of commands in the 
current setups. The virtual sound image is generated using multiple loudspeakers, not with a single real 
one. The VBAP method to generate virtual sound images is described in the following sections together 
with the proposed multi-command saBCI application. Additionally we propose to utilize sound depth 
panning to further possibly increase a number of commands in a limited loudspeaker number environment. 
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We conduct a series of psychophysical and offline saBCI EEG experiments in order to address the 
following questions: 

1 . Does the spatial sound stimuli delivery system influences the results? In detail, is there any signif- 
icant difference between real and virtual sound images? 

2. Does a type of sound stimuli timbre matter? 

3. Does the sound image depth (modulated volume in our case) influences the responses? 

The paper from now on is organized as follows. First we introduce spatial sound creation and delivery 
system. Next psychophysical and offline asBCI experiments are described. Finally results discussion 
concludes the paper. 



2 Methods 

This section describes the stimuli generation methods, EEG capture experimental protocol in offline 
saBCI mode, and the ERP processing steps in order to elucidate and prove separability of the brain P300 
responses for the future online multi-command application with a classifier. 

2.1 VBAP Method for Generating of Spatial Virtual Sound Sources in Ap- 
plication to saBCI 

In the proposed saBCI paradigm the user receives the real and virtual sound source stimuli that are 
created using a MAX environment |7J. The surround sound environment consist of the eight loudspeakers 
distributed octagonally at a single meter distance from the subject's head as depicted in Figure [l] 

We implement a custom VBAP [8] patch in MAX environment that controls the eight spatially dis- 
tributed loudspeakers in real time. The developed VBAP patch can be used to position a virtual sound 
source with any arbitrary loudspeaker configuration. The amplitude panning concept implemented in the 
VBAP method is formulated with vectors pointing the desired sound source position in the sound field. 
In our particular configuration of a horizontal plane defined by the octagonally distributed loudspeakers 
the virtual sound images could be created only the pairs of neighboring loudspeakers. The simplified 
pair-wise VBAP application reduces the sound amplitude panning to only two loudspeakers used at a 
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time. The proposed VBAP reformulation leads to simpler equations for amplitude panning, and the use 
of the vectors makes the panning computationally efficient. The VBAP implementation consists of three 
objects (functions) in MAX environment: deftneJoudspeakers, vbap, and matrix^ , as depicted in a flow 
digram in Figure [3| 

In our proposal of the simplified two-dimensional VBAP method the two-channel loudspeaker config- 
uration is reformulated as a two-dimensional vector base. The base is defined by unit length vectors 



h = [hx h y ] T and L 



2 — [hx hy] T , 



(1) 



which are pointing toward the loudspeaker #1 and #2, respectively, as depicted in Figure [4j The 
superscript T denotes the matrix transposition. The l ix and li y are the 1^ vector coordinates on x and y 
axes, respectively. The unit-length vector 



P = [Px Pyf, 



(2) 



pointing toward the desired virtual sound source can be treated as a linear combination of the two 
neighboring loudspeaker vectors as follows 



P = 9ih +92h, 



(3) 



where g\ and g 2 are the gain factors, which are the nonnegative scalar variables to be optimized by VBAP 
in order to position the virtual sound image. The equation ^ could be reformulated in a matrix form as 



P = (g L i 2 ) , 



(4) 



where g = [51 g 2 ] and L 12 = [h h] T ■ The equation Q4J> can be solved if L 12 1 exists, so the gain vectors 
could be obtained as follows 



g = P L 12 = [pi p 2 ] 



(5) 



The inverse matrix L 12 1 satisfies L^L-^ 1 = I, where I is the identity matrix. The gain factors g\ and g 2 
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calculated using equation ([5]). 



2.2 Increasing the Number of the saBCI Commands with the Spatial Depth 
Adjustment of the Sound Sources 

Until now we only discussed the spatial directional sound sources manipulation in order to create the 
multi-command saBCI paradigm. Human brain auditory system can also distinguish easily a change of 
sound source distance by comparing their loudnesses. We propose to test additionally how the changing 
sound distance or depth could help us to increase a number of possible saBCI commands. In the previously 
introduced VBAP method in equation Q the loudspeaker sound amplitudes are controlled with gain 
factors (?i and g%, respectively. In traditional VBAP implementation the sum of squares of the gain 
factors shall always obey 



However, the total sound loudness could be set to a constant value C, that would reflect changing sound 
source depth location as follows 



So a depth setting for c = 1 could be referred as a standard/reference setting. The smaller value of c, as 
in equation ([7]), will create a farther distance sound illusion. In our experiments discussed in the following 
sections we use settings of c = {0.2, 1} to create sound sources with loudnesses of approximately ~ 30dB 
or ~ 70dB, respectively, in our laboratory's soundproof studio. 

3 The Psychophysical Experiment with Eight Loudspeakers to 
Evaluate Spatial Sound Stimuli 

Sound localization performed by human subjects is limited by some features related to head and ear 
shapes. The most common localization problem is related to front/back sound source confusion, and 
consequently it is difficult for people to discriminate the exactly frontal or rear positions. In order 
to create and test the auditory stimuli for their difficulty levels we perform first the psychophysical 
experiment to measure the behavioral responses before the final EEG experiment in offline BCI setting. 



.9i + .92 = 1- 



(6) 



gl+g 2 2 =c 0<C<1. 



(7) 
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The spatial target directions resulting in later behavioral responses in form of button presses shall be 
classified as the more difficult or causing more cognitive load 13] . Also a number of mistakenly recognized 
targets in a random series in each trial identifies possible variable difficulty of the task. Our goal is to 
design a final spatial oddball paradigm which would require the same cognitive load for all directions in 
order to have similar P300 response easy to classify by machine learning algorithms later. 

In the series psychophysical experiments described below we compare the behavioral reaction times 
as modulated by spatial cognitive load difficulty in function of: 

• Various timbres of the sound stimuli; 

• Real or virtual sound images creation techniques (real loudspeakers or VBAP); 

• Various sound depth settings. 

In all psychophysical experiments the sound sources are generated with the eight loudspeakers of our 
system as depicted in Figure [5j 

The experiments were conducted with agreement of the local institutional ethical committee guide- 
lines. The subjects were explained the psychophysical and EEG experimental procedures in detail. All 
of them participated voluntarily. 

In the first experimental session, all the sound sources are generated from the real loudspeaker di- 
rections, thus this session is labeled as "the real." The second session utilizes the proposed simplified 
version of VBAP, as in equation Q , in order to create virtual sound sources at the positions between the 
neighboring loudspeakers. This second experimental session is labeled as "the virtual." For example in 
the virtual setting, a sound from a direction of 45° is created actually with the two loudspeakers placed 
at 0° and 90°, instead of the actual one located there in the real experiment. Such setting allows us to 
compare the real and virtual sound reproduction setting for exactly the same spatial directions. 

3.1 Psychophysical Experiment Protocol 

The procedure of the psychophysical experiment in a single trial is composed of the following steps: 

1. An instruction is given to the subject of which spatial sound stimuli direction and which source 
depth setting (loudness) to attended in form of a sound delivered and followed by a break. 
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2. Next the subject listens to the 16 randomly ordered directions with the same stimulus timbre played 
from each (each direction has two depth settings). 

3. The subject responses immediately (as fast as possible) by pressing a computer keyboard button 
after the target direction sounds occurs in the sequence, as instructed in the above step 1). 

4. The above procedure is repeated for all the 16 target directions - with single one in each trial. 

In order not to overlap the psychophysical responses (usually around 500ms for our subjects) with 
the following stimuli the stimuli onset asynchrony (SOA) is set to 1000ms with stimuli lasting for only 
the first 500ms. The followings are the experimental method to determine the difficulty of each spatial 
sound stimuli. After each trial consisting of 16 spatial sound sources (chance level = 1/16 = 6.25%) we 
evaluate subject's response correctness and the delay time. Also a comparison among the directions and 
depth levels is carried. 

Table [I] summarizes the details of the psychophysical experiment settings. 

We conduct the experiments for the three kinds of stimulus timbre in order to evaluate also the 
possible subject preferences. The results are discussed in the next section. 

3.2 Psychophysical Experiment Results 

The results of conducted psychophysical experiments are presented in Tables [2] [9| The analysis of sound 
depths variability, summarized in Tables [2j |4j [6j and [HJ resulted with the lowest and no significant 
differences of response times, as tested with Wilcoxon-pariwise-test for the means. This result supports 
the hypothesis of the possibility to utilize this feature for the saBCI paradigm design in real and virtual 
sound reproduction settings. 

On the other hand the results summarized in Tables|3j[5j for real sound sources, and in Tables [7J [9] for 
virtual ones, resulted also in no significant differences for directions based on the response time analyzes. 

However a comparison of accuracy rates (AR) for real and virtual sound sources in all Tables [2jj9] 
showed significant differences with much lower values for virtual. This shows that VBAP application still 
needs further research on the sound quality related to virtual image spatial spreads. The differences show 
that virtual sound images are more difficult to localize so they shall not be mixed with the real sounds 
in a single experimental trials in the offline saBCI experiments as presented in the next section. 
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4 The Offline saBCI Experiment 

Based on previously discussed very encouraging psychophysical experiment results showing no relation 
between subjects behavioral performance and spatial sound directions we proceed to the offline saBCI 
paradigm EEG recordings in which the brain responses will be measured to further test our original 
hypothesis of the possibility to utilize real and proposed virtual spatial sound sources in the oddball 
experimental paradigm. Now we verify whether the ERP responses show any differences for directions 
and reproduction system types. Since the previously conducted psychophysical experiment indicated no 
relation to sound timbres, in the following analysis of EEG responses we average all types. In order to 
do so we conduct the following experiments with the same six volunteers and with accordance of the 
institutional ethical committee guidelines. 

The EEG signals are captured with eight dry electrode portable wireless system by g.tec (g.MOBIIab+ 
& g. SAHARA). The EEG signals are sampled at 256Hz rate and stored on a laptop computer running 
custom MATLAB Simulink patch. The user interface and stimulus are created in MAX. 

4.1 The Offline saBCI EEG Experimental Protocol 

The offline saBCI experimental protocol for a single trial is as follows: 

1. First an auditory instruction is given to the subject to which spatial sound stimuli to attended. 

2. Next the subject listens to the 16 randomly ordered stimuli, but this time no any behavioral response 
is given and only counting of targets is advised to keep a focused attention. 

3. The above procedure is repeated for all target directions and volumes separately, resulting in 16 
trials for each timbre. 



The detailed experimental setting are summarized in Tabic 10 



4.2 EEG Signal Preprocessing and Analysis 

Two Butterworth S'^-order low- and high-pass filters are subsequently applied with cutoff frequencies at 
0.5Hz and 20Hz respectively to remove low frequency and DC-shift interferences. The low-pass filtering 
removes possible muscle movement frequency artifacts. The preprocessed EEG signals are next segmented 
to create event related epochs starting at —100ms from the stimulus onset and ending at 800ms after it. 
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4.3 EEG Preprocessing to Clean the Signals from Eye-Movements with Mul- 
tivariate Empirical Mode Decomposition 

There are many signal processing methods trying to remove eye- and muscle-movement interferences from 
EEG signals. The most successful approaches belong the new developments based on empirical mode 
decomposition (9lfTol , since they outperform ICA and blind source separation based approaches [Til. 



There exist already multiple signal processing approaches that extend the classical EMD 12 concept 
which constitutes the fully data adaptive technique to decompose any non-linear or non-stationery signal 
into a finite set of band- limited basis functions called intrinsic mode functions (IMFs). Each of the 
decomposed IMFs carry the both amplitude and frequency modulated oscillatory components. The very 
recent development in the field is the multivariate EMD (MEMD) [13], which is a more generalized 
extension of the classical univariate EMD technique. The most interesting part of that approach is that 
it can deal instantaneously with multivariate dataset, like EEG for example. The major novelty proposed 



13 is based on possibility to process the n-dimensional signals in the same number of spaces. As a 



result at each decomposition step the n-dimensional envelopes are generated by taking signal projections 
along different directions in the same dimensional spaces. Next, as in classical univariate EMD, the local 
mean is calculated as an approximation of the integral of all envelopes along multiple directions in the 



n-dimensional space. The authors in 13 very elegantly resolve a problem related to a lack of formal 
definition of maxima and minima in multidimensional domains by utilizing a sampling concept based on 
low discrepancy Hammersley sequence to generate the projections. In a next step, the extrema could be 
interpolated using cubic splines. The resulting signal envelopes are obtained from the projections along 
different directions in multidimensional spaces. They are finally averaged to obtain the local mean of the 
original multivariate EEG signal, as in our case. The whole MEMD decomposition procedure of an input 



multivariate signal s(t) into a multidimensional set of IMFs, as in 14 , could be summarized as follows: 

1. As the first step, generate the point-set based on the Hammersley sequence for an uniform sampling 
of the EEG on an (n — 1) — dimensional — sphere; 

2. In the second step, calculate a projection {p 6k (t)}f =1 of the input EEG {s(t)}f =1 along the direction 
vector X 6k , for all k (the whole set of direction vectors) resulting in {p 6k (t)}^ =1 set of projections; 

3. In the third step, find the time points {t^ k }j^ =1 representing the maxima in the set of projected 
signals {/K*)}f=i; 
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4. As the fourth step, perform interpolation [t^, s(t^ fe )], for all values of fc, in order to obtain the 
multivariate envelope curves {v k (t)}£ =1 ; 

5. In the fifth step, for a set of k direction vectors, calculate the mean m(t) of the envelope curves as: 



6. Finally, extract/sift the "detail" d(t) using d(t) — X(t) — m(t). Similarly as in classical univariate 
EMD approach, the "detail" d(t) shall fulfill the stoppage criterion for a multivariate IMF, so apply 
the above procedure to X{t) — d(t), otherwise apply it to d(t). 

Once the first multivariate IMF is identified, it is subtracted from the input signal and the same process 
is applied to the resulting signal yielding the second IMF, etc. In the multivariate case, similarly as in 
univariate one, the residue corresponds to a signal whose projections do not contain enough extrema to 
form a meaningful multivariate envelope. The stopping criterion in MEMD is also similar to standard 
EMD [12] one, only with a difference that the condition for equality of the number of extrema and zero 
crossings is not imposed, as extrema could not be properly defined for multivariate signals. The resulting 
from MEMD decomposition filter banks represent an array of band-pass filters designed to filter, with 
possible spectral overlap, the input multivariate signal into different frequency bands. The frequency 
bands are the same for all IMFs which is a very interesting feature of the MEMD method. In case of 
sequential implementation of classical univariate EMD, each channel would result with different number 
of IMFs, which would increase a complexity of artifacts identification. The MEMD decomposition allows 
for simple comparison of the adaptive subband filtering results and rejection only of the contaminated 
by eye/muscle-movement IMFs. 

For the eye/muscle-movement identification we apply an peak — to — peak thresholding over the the all 
IMFs. Those IMFs which exceed 20/iV peak-to-peak value are marked as "contaminated" and removed 
from the following reconstruction. The result of the procedure is shown in Figure [6j where the very 
strong eye-blinks, completely damaging the averaged ERPs, are successfully removed with the proposed 
method. 




(8) 



k = l 
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4.4 The Offline saBCI EEG Analysis Results 

The averaged results of the conducted EEG experiments with the six subjects in the offline saBCI mode 
are presented in Figures [7]-[T0] Each of those figures has eight panels comparing the averaged responses 
to targets (solid-blue lines) and to non-targets (dashed-red lines) for each EEG electrode used in our 
experiments. The presented ERPs are the grand averages for all subjects and all sound timbres used in 
our experiments. The comparison between the real (see Figures [7] and [8| and virtual (see Figures[9]and 10 1 
sound images show that strong P300 responses (marked by black dotted-line squares) could be obtained 
in both cases. This results show that the both real and virtual sound reproduction systems yield similar 
results of saBCI stimuli generation. Additionally we can see from the results with farther sound (smaller 
loudness) in Figures [8] and [To] that the P300 response seems to be stronger in amplitude comparing to 
closer distance (larger loudness) results in Figures [7] and [9] This very interesting phenomenon will be a 
target of our future research. The most important message from our current experimental session is that 
the proposed simplified VBAP application is a valid method to create the good saBCI stimuli. We could 
not observe any dependence of the spatial sound locations on P300 responses which is also a good result 
for future saBCI applications. 



5 Conclusions 

The purpose of the presented research was to check whether virtual sound images creation procedures 
were useful for the saBCI paradigm development. We have been researching a possibility to increase the 
possible number of BCI commands for future applications by keeping a limited number of the loudspeaker 
units. We conducted two experimental series to test the hypotheses of real and virtual images equality 
for saBCI paradigm design, sound depth manipulation, and whether the various timbres modulate the 
psychophysical and brain responses. 

The very small, yet not significant, differences in psychophysical experiments were later not reproduced 
in EEG experiments, in offline saBCI setting, which is a very good sign for the future developments 
in saBCI research field. The psychophysical experiments confirmed, based on the lack of significant 
differences of pairwise statistical tests, that the directions and sound reproduction systems (real or virtual) 
result with the same behavioral responses. 

The brain responses in offline saBCI setting further confirmed the original hypothesis of the similarity 
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of real and virtual sound settings. We reported also on the very interesting ERP modulation depending 
on sound depth (loudness) which resulted with slightly different, larger for the farther sound, P300 
responses. This topic will be further investigated in our future research. 

In the future, we will develop and online saBCI application based on the proposed simplified VBAP 
approach in order to create a more easy to use interfacing paradigm with possible larger and independent 
from the loudspeakers quantity the multiple commands. 

Additionally we plan to further expend the system to use fully three-dimensional sound field by 
adding elevation variability, which shall further improve and enhance the system capability. Vertical 
sound localization is a bit more difficult to reproduce, yet it shall improve the interfacing comfort. 

The presented results are a step forward in development of vision-free and auditory spatial brain 
computer interfacing paradigm which shall in the future support and improve comfort of life of locked-in 
patients. 
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Figure Legends 

Figure [l] Schematic drawing of the proposed auditory BCI system. The subject in the center is sur- 
rounded by horizontally distributed loudspeakers. EEG signals are processed by the BCI processing 
stages in synchrony with presented spatial auditory stimuli. 

Figure [2] An example of averaged brain evoked responses to target (solid blue line) and non-target 
(dotted red line). The difference in ERP waves is mostly pronounced around 300ms where usually 
P300/aha-response occurs. The zero second time point indicates stimulus onset. 

Figure [3] A schematic figure of the VBAP method implementation. 

Figure [4] A concept of the simplified horizontal panning of sound source with two loudspeakers based 
on the original VBAP [8] implementation. 

Figure [5] The locations of the eight loudspeakers in our experiments. The frontal to the subject loud- 
speaker at a position of 0° . 

Figure [6] An example of EMD ERP cleaning. The top panel depicts the original contaminated EEG 
ERP response with large drifts for both the in target (blue, solid line) and non-target responses 
(red, dashed line) . The cleaned version is shown in lower panel, where there ERP related peaks are 
clearly visible. 

Figure [7] EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with closer 
depth setting (~ 70dB). The P300-response latencies are marked with black-dotted line squares 
for each electrode. 

Figure [8] EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with farther 
depth setting (~ 30dB). The P300-response latencies are marked with black-dotted line squares 
for each electrode. 

Figure [9] EEG evoked responses to virtual sound stimuli using VBAP. The blue-solid lines represent 
the averaged responses to the target sounds, while the red-dashed lines to the averaged non-targets 
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with farther depth setting (~ 70dB). The P300-response latencies are marked with black-dotted 
line squares for each electrode. 



gure 10 EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with farther 
depth setting (~ 30dB). The P300-response latencies are marked with black-dotted line squares 
for each electrode. 
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Tables 

Table 1. The psychophysical experiment conditions. 



I Anrlifi An 

^OIlUlllOIl 


rdl dint; I til b 


Sound stimuli 


white noise, MIDI sound, effect sound 


Sound source 


eight channel loudspeaker system 


Response input 


computer keyboard 


Sound directions 


8 

(-135°, -90°, -45°, 0°, 45°, 90°, 135°, 180°) 


Volume levels 


- 30dB and - 70dB 


Subjects 


6 


Ttrials 


12 (2 trials x 3 stimuli x 2 real/virtual-sounds) 


Stimulus length 


500ms 


SOA 


1000ms 



Table 2. Accuracy rate of psychophysical response to real sound stimuli with the nearer sound source 
setting (~ 70dB) as a function of various stimuli (AR = accuracy rate, ART = average response time, a 
= standard deviation). 



Stimulus type 


AR [%} 


ART [ms] 


a [ms] 


White noise 


91.0 


574 


103 


MIDsound 


93.2 


571 


102 


Effect sound 


82.0 


570 


104 
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Table 3. Accuracy rate of psychophysical response to real sound stimuli with the nearer sound source 
setting (~ 70dB) as a function of the spatial direction (AR = accuracy rate, ART = average response 
time, a = standard deviation). 



Sound directions 


AR [%] 


ART [ms] 


a [ms] 


-135° 


94.4 


603 


91.6 


-90° 


91.0 


573 


98.4 


-45° 


94.4 


554 


79.5 


0° 


91.6 


535 


94.2 


45° 


100.0 


547 


88.9 


90° 


88.8 


582 


95.7 


135° 


88.8 


590 


102.0 


180° 


86.1 


537 


128.0 



Table 4. Accuracy rate of psychophysical response to real sound stimuli with the nearer sound source 
setting (~ 30dB) as a function of various stimuli (AR = accuracy rate, ART = average response time, a 
= standard deviation). 



Stimulus type 


AR [%} 


ART [ms] 


a [ms] 


White noise 


90.6 


582 


117 


MIDsound 


91.7 


539 


105 


Effect sound 


75.0 


562 


99.4 
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Table 5. Accuracy rate of psychophysical response to real sound stimuli with the nearer sound source 
setting (~ 30dB) as a function of the spatial direction (AR = accuracy rate, ART = average response 
time, a = standard deviation). 



Sound directions 


AR [%] 


ART [ms] 


a [ms] 


-135° 


91.6 


562 


113 


-90° 


91.6 


531 


90.2 


-45° 


83.3 


553 


84.7 


0° 


80.5 


541 


115 


45° 


88.8 


531 


92.9 


90° 


77.7 


589 


98.0 


135° 


88.8 


617 


148 


180° 


83.3 


562 


98.9 



Table 6. Accuracy rate of psychophysical response to virtual sound stimuli with the nearer sound 
source setting (~ 70dB) as a function of various stimuli (AR = accuracy rate, ART = average response 
time, a = standard deviation). 



Stimulus type 


AR [%} 


ART [ms] 


a [ms] 


White noise 


54.6 


612 


115 


MIDsound 


44.7 


571 


115 


Effect sound 


47.3 


623 


113 
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Table 7. Accuracy rate of psychophysical response to virtual sound stimuli with the nearer sound 
source setting (~ 70dB) as a function of the spatial direction (AR = accuracy rate, ART = average 
response time, a = standard deviation). 



Sound directions 


AR [%] 


ART [ms] 


a [ms] 


-135° 


67.2 


630 


135 


-90° 


47.7 


641 


92 


-45° 


49.4 


595 


149 


0° 


56.6 


583 


103 


45° 


43.8 


634 


125 


90° 


44.4 


619 


106 


135° 


45.0 


605 


104 


180° 


71.6 


611 


142 



Table 8. Accuracy rate of psychophysical response to virtual sound stimuli with the nearer sound 
source setting (~ 30dB) as a function of various stimuli (AR = accuracy rate, ART = average response 
time, a = standard deviation). 



Stimulus type 


AR [%} 


ART [ms] 


a [ms] 


White noise 


46.9 


603 


109 


MIDsound 


44.3 


607 


132 


Effect sound 


45.6 


625 


90.9 
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Table 9. Accuracy rate of psychophysical response to virtual sound stimuli with the nearer sound 
source setting (~ 30dB) as a function of the spatial direction (AR = accuracy rate, ART = average 
response time, a = standard deviation). 



Sound directions 


AR [%] 


ART [ms] 


a [ms] 


-135° 


40.0 


603 


78.1 


-90° 


49.4 


605 


108 


-45° 


51.6 


597 


127 


0° 


37.7 


586 


121 


45° 


37.2 


668 


98.9 


90° 


40.5 


607 


111 


135° 


42.2 


648 


126 


180° 


65.5 


620 


82.1 



Table 10. The offline saBCI EEG experimental conditions. 



Condition 


Parameters 


Sound stimuli 


white noise, MIDI sound, effect sound 


Sound source 


eight channel loudspeaker system 


Response input 


8-electrodes EEG system 


Sound directions 


8 

(-135°, -90°, -45°, 0°, 45°, 90°, 135°, 180°) 


Volume levels 


- 30dB and - 70dB 


Subjects 


6 


Ttrials 


12 (2 trials x 3 stimuli x 2 real/virtual-sounds) 


Stimulus length 


500ms 


SOA 


1000ms 
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Figure 1. Schematic drawing of the proposed auditory BCI system. The subject in the center is 
surrounded by horizontally distributed loudspeakers. EEG signals are processed by the BCI processing 
stages in synchrony with presented spatial auditory stimuli. 




Figure 2. An example of averaged brain evoked responses to target (solid blue line) and non-target 
(dotted red line). The difference in ERP waves is mostly pronounced around 300ms where usually 
P300/aha-response occurs. The zero second time point indicates stimulus onset. 
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Input: 

Location Information 
of loudspeakers 



"definejoudspeakers": 
Calculate to form a data set 



Spreading control 



"vbap" 

Calculate the gain factors of 
the loudspeakers 



"matrix-" : 

Pan the audio signals to loudspeakers 



"Is-delay" : 

Compensate different loudspeaker distance 



Output : 
Audio signals 
of loudspeakers 



Figure 3. A schematic figure of the VBAP method implementation. 
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Figure 4. A concept of the simplified horizontal panning of sound source with two loudspeakers based 
on the original VBAP |8| implementation. 




Figure 5. The locations of the 
loudspeaker at a position of 0°. 



eight 



loudspeakers in our experiments 



The frontal to the subject 
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The original contaminated ERP responses 




time after the stimuli offset [s] 




time after the stimuli offset [s] 



Figure 6. An example of EMD ERP cleaning. The top panel depicts the original contaminated EEG 
ERP response with large drifts for both the in target (blue, solid line) and non-target responses (red, 
dashed line). The cleaned version is shown in lower panel, where there ERP related peaks are clearly 
visible. 
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Figure 7. EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with closer depth 
setting (~ 70dB). The P300-response latencies are marked with black-dotted line squares for each 
electrode. 
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Figure 8. EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with farther 
depth setting (~ 30dB). The P300-response latencies are marked with black-dotted line squares for 
each electrode. 
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Figure 9. EEG evoked responses to virtual sound stimuli using VBAP. The blue-solid lines represent 
the averaged responses to the target sounds, while the red-dashed lines to the averaged non-targets 
with farther depth setting (~ 70dB). The P300-response latencies are marked with black-dotted line 
squares for each electrode. 
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Figure 10. EEG evoked responses to real sound stimuli. The blue-solid lines represent the averaged 
responses to the target sounds, while the red-dashed lines to the averaged non-targets with farther 
depth setting (~ 30dB). The P300-response latencies are marked with black-dotted line squares for 
each electrode. 



